CLAIMS 



1. An electronic device, comprising:

a processor for executing an operating system program and a media content
presentation program;

a media content pickup device operatively connected to said processor, said
media content pickup device captures media content input, and said media content
pickup device focuses the media content input on a user-specified region of interest;
and

a media output path to receive and carry the focused media content.



2. An electronic device as recited in claim 1, wherein the user-specified region of
interest is specified by a user through interaction with a graphical user interface.

3. An electronic device as recited in claim 2, wherein the graphical user interface 
is provided by a media content presentation program that is executed by said 
processor. 



4. An electronic device as recited in claim 2, wherein said media output path
carries the focused media content to be provided to a media output device, the 
media output device being part of said electronic device or separate from said 
electronic device. 



5. An electronic device as recited in claim 4,

wherein said media output device is a monitor,

wherein the graphical user interface is displayed on said monitor, and

wherein the graphical user interface includes at least a media content display
window.

6. An electronic device as recited in claim 5, wherein the user-specified region of
interest is specified by the user with reference to the media content display
window.

7. An electronic device as recited in claim 4, wherein said media output device is 
a monitor. 

8. An electronic device as recited in claim 4, wherein said media output device is
at least one speaker.

9. An electronic device as recited in claim 1, wherein the media content input is
at least one of audio content or video content.

10. An electronic device as recited in claim 1, wherein said media content pickup
device is at least one of a camera and a plurality of microphones.

11. An electronic device as recited in claim 1, wherein said electronic device is
one of a mobile telephone, a personal computer, a personal digital assistant, and a
handheld computer.

12. A computer system, comprising: 

a processor for executing a video application program; 

a camera operatively connected to said processor, said camera captures
video input in accordance with its field of view, and said camera focuses the video
input on a determined region of the field of view, the determined region being
determined in accordance with a user input; and

a data output means operatively connected to said processor, said data 
output means operates to provide the focused video input for display. 

13. A computer system as recited in claim 12, wherein said processor receives a
user input that indicates the determined region of the field of view.

14. A computer system as recited in claim 13, wherein the user input is with
respect to a window displayed on said display.

15. A computer system as recited in claim 14, wherein the user input is a user 
selection of a region of the window. 

16. A computer system as recited in claim 12, wherein said computer further 
comprises: 

at least one microphone for sound pickup. 

17. A computer system as recited in claim 16, wherein the video application
program is an audio-video application, and wherein said processor receives the
sound pickup from said at least one microphone and supplies audio output to a
speaker.

18. A computer system as recited in claim 17, wherein the speaker is coupled to
and associated with said computer.

19. A computer system as recited in claim 12, wherein said computer further 
comprises: 

a plurality of microphones for sound pickup, said microphones having a
known positional relationship to one another,

wherein said microphones are integral with said camera. 

20. A computer system as recited in claim 19, wherein said processor receives
audio input from each of said microphones and processes the audio input to
emphasize audio sound from the determined region that has been determined in
accordance with the user input.

21. A method for altering a focus location for a camera coupled to a computing
apparatus, said method comprising: 

receiving video input from the camera; 

receiving an identification of a focus region; and 

causing the camera to focus on the focus region.

22. A method as recited in claim 21, including the further step of displaying the
video input in a video viewing window of a monitor.

23. A method as recited in claim 22, wherein the user specifies the focus region
by selecting an area of the video viewing window.



24. A method as recited in claim 23, wherein the user moves a cursor image over
the video viewing window using a pointing device to an area of interest, and then
selects the focus region by clicking on the area of interest.
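
By way of illustration only, the selection recited in claims 23 and 24 can be sketched as a mapping from a click position in the video viewing window to a rectangular focus region. The names `FocusRegion` and `click_to_focus_region`, the normalized coordinate convention, and the default region size are assumptions made for this sketch and are not taken from the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FocusRegion:
    """Rectangle in normalized frame coordinates (0.0 to 1.0). Hypothetical type."""
    left: float
    top: float
    width: float
    height: float


def click_to_focus_region(click_x, click_y, win_w, win_h, size=0.2):
    """Center a region of normalized extent `size` on the clicked point.

    The region is shifted as needed so that it stays inside the frame.
    `size` is an assumed default; the claims do not specify one.
    """
    if win_w <= 0 or win_h <= 0:
        raise ValueError("window dimensions must be positive")
    if not 0.0 < size <= 1.0:
        raise ValueError("size must be in the range (0, 1]")
    # Convert the pixel position to normalized coordinates, clamped to the window.
    cx = min(max(click_x / win_w, 0.0), 1.0)
    cy = min(max(click_y / win_h, 0.0), 1.0)
    # Center the region on the click, then keep it within the frame.
    left = min(max(cx - size / 2, 0.0), 1.0 - size)
    top = min(max(cy - size / 2, 0.0), 1.0 - size)
    return FocusRegion(left, top, size, size)
```

A click at the center of a 640 by 480 window yields a region centered in the frame, while a click at a window corner yields a region shifted to sit against that corner. Normalized coordinates are used here so the region does not depend on the window size.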

25. A method as recited in claim 24, wherein the user performs a button press to 
select the focus region. 



26. A method as recited in claim 25, wherein the button press is with respect to a
pointing device.

27. A method as recited in claim 26, wherein the pointing device is a mouse, 
trackball or a trackpad. 
APL1P281/P3101 21 



28. A method as recited in claim 22, wherein the user moves a position reference 
image over the video viewing window using a pointing device to an area of interest, 
and then selects the focus region by clicking on the area of interest. 

29. A method as recited in claim 21, wherein the focus region is an area of
interest specified by the user.

30. A method as recited in claim 21, wherein said receiving of the audio input is
supplied from a first computing apparatus to a second computing apparatus, and
said displaying of the video input and said receiving of the focus region are
performed on the second computing apparatus.

31. A method as recited in claim 21, wherein the computing apparatus is one of a
mobile telephone, a personal computer, a personal digital assistant, and a handheld
computer.

32. A method for using a computing apparatus to process audio input provided by 
a plurality of microphones, said method comprising: 

receiving audio input from the plurality of microphones;

receiving an indication of a region of interest from a user with respect 
to a graphical user interface window being displayed on a monitor available to the 
user; and 

processing the audio input to focus the audio input towards the region of
interest.

33. A method as recited in claim 32, wherein said method further comprises:
outputting the processed audio input to at least one speaker.

34. A method as recited in claim 33, wherein said method further comprises: 

repeating the foregoing operations after said outputting has output the
processed audio input to the at least one speaker.

35. A method as recited in claim 32, wherein said processing captures audio from 
the region of interest while attempting to reject audio from other regions. 

36. A method as recited in claim 32, wherein said processing utilizes beam 
forming and beam steering operations. 
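
By way of illustration only, one common way to realize beam forming and beam steering with a plurality of microphones is delay-and-sum processing. The claims do not name a particular technique, so the sketch below is an assumption: it supposes a linear array, a far-field source, whole-sample delays, and hypothetical function names `steering_delays` and `delay_and_sum`.

```python
import math

SPEED_OF_SOUND = 343.0  # meters per second, approximate value in air


def steering_delays(mic_positions, angle_deg, sample_rate):
    """Whole-sample delays that align a plane wave arriving from `angle_deg`.

    `mic_positions` are positions in meters along a linear array, and the
    angle is measured from broadside, positive toward increasing position.
    """
    sin_a = math.sin(math.radians(angle_deg))
    # Arrival time, in samples, relative to a microphone at position 0.
    # Microphones nearer the source receive the wavefront earlier.
    arrival = [-(x * sin_a) / SPEED_OF_SOUND * sample_rate for x in mic_positions]
    latest = max(arrival)
    # Delay each channel so that all channels line up with the latest arrival.
    return [round(latest - t) for t in arrival]


def delay_and_sum(channels, delays):
    """Average the channels after delaying each by its whole-sample delay."""
    n = min(len(ch) for ch in channels)
    out = []
    for i in range(n):
        total = 0.0
        for ch, d in zip(channels, delays):
            j = i - d
            if 0 <= j < n:  # samples shifted in from outside count as silence
                total += ch[j]
        out.append(total / len(channels))
    return out
```

Sound arriving from the steered direction adds coherently after the delays are applied, while sound from other directions adds out of phase and is attenuated. A practical implementation would likely use fractional delays and per-channel weighting, neither of which is shown here.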

37. A method as recited in claim 32, wherein a camera couples to the computer,
and

wherein the camera has a housing and the microphones are internal to the 
housing of the camera. 

38. A method as recited in claim 32, wherein the user performs a button press to
select the region of interest.

39. A method as recited in claim 38, wherein the button press is with respect to a 
pointing device. 

40. A method as recited in claim 39, wherein the pointing device is a mouse,
trackball or a trackpad.

41. A method as recited in claim 32, wherein the user moves a position reference
image over the graphical user interface window using a pointing device to an area of
interest, and then selects the region of interest by clicking on the area of interest.

42. A method as recited in claim 32, wherein said receiving of the audio input is 
supplied from a first computing apparatus to a second computing apparatus, and 
said displaying of the graphical user interface window and said receiving of the 
indication of the region of interest are performed on the second computing 
apparatus. 

43. A method as recited in claim 32, wherein the computing apparatus is one of a 
mobile telephone, a personal computer, a personal digital assistant, and a handheld 
computer. 

44. A video conferencing system operable over a network, said video 
conferencing system comprising: 

a first computer system including at least a first processor for executing a first 
operating system program and a first video application program, a first camera to 
capture first video input, and a first monitor; and 

a second computer system operatively connectable to said first computer 
system via the network, said second computer system including at least a second 
processor for executing a second operating system program and a second video 
application program, a second camera to capture second video input, and a second monitor;

wherein when said first computer system and said second computer system 
are involved in a video conference, said first monitor displays the second video input 
provided by said second camera via the network, and said second monitor displays 
the first video input provided by said first camera via the network, and 

wherein when a first user interacts with a first graphical user interface 
presented on said first monitor to select a region of interest with respect to the 
second video input, said second camera then focuses itself so that the second video 
input is focused on the region of interest. 

45. A video conferencing system as recited in claim 44, wherein the first graphical
user interface includes at least a window presented on said first monitor, the window
containing the second video input provided by said second camera.

46. A video conferencing system as recited in claim 45, wherein the first user 
interfaces with the first graphical user interface by moving a graphical indicator over 
the window to identify the region of interest and then indicating its selection. 

47. A video conferencing system as recited in claim 44,

wherein said first computer system further includes at least a first plurality of 
microphones and a first speaker, 

wherein said second computer system further includes at least a second 
plurality of microphones and a second speaker, 

wherein second audio input obtained by said second plurality of microphones
is provided to said first computer system via the network and then output to said first
speaker,

wherein first audio input obtained by said first plurality of microphones is
provided to said second computer system via the network and then output to said
second speaker, and

wherein said second computer system performs processing on the
second audio input based on the region of interest selected by the first user,
whereby the second audio input is processed so as to emphasize audio sound from
the region of interest.

48. A video conferencing system as recited in claim 44, 

wherein said first plurality of microphones are internal to a housing of said first 
camera, and 

wherein said second plurality of microphones are internal to a housing of said
second camera.

49. A computer readable medium including at least computer program code for
directing media content input, said computer readable medium comprising:

computer program code for receiving media content input from a media 
content capturing device; 

computer program code for receiving a user-specified region of interest for the 
media content input; 

computer program code for processing a media content input into processed
media content based on a user-specified region of interest; and

computer program code for providing the processed media content to a 
device. 

50. A computer readable medium as recited in claim 49, wherein said device is a
monitor, and

wherein the processed media content produces a graphical user interface on 
said monitor. 

51. A computer readable medium as recited in claim 50, wherein the graphical
user interface includes at least a media content display window.

52. A computer readable medium as recited in claim 51, wherein the
user-specified region of interest is specified by the user by selecting a region within the
media content display window.

53. A computer readable medium as recited in claim 49, wherein the media 
content input is at least one of audio content or video content. 

54. A computer readable medium as recited in claim 49, wherein said media
content pickup device is at least one of a camera and a plurality of microphones.

55. A computer readable medium as recited in claim 49, wherein the media output 
device is a monitor and/or at least one speaker. 